Boosted Mean Shift Clustering

Authors

  • Ya-Zhou Ren
  • Uday Kamath
  • Carlotta Domeniconi
  • Guoji Zhang
Abstract

Mean shift is a nonparametric clustering technique that does not require the number of clusters as input and can find clusters of arbitrary shapes. While appealing, the performance of the mean shift algorithm is sensitive to the selection of the bandwidth, and it can fail to capture the correct clustering structure when multiple modes exist in one cluster. DBSCAN is an efficient density-based clustering algorithm, but it is also sensitive to its parameters and typically merges overlapping clusters. In this paper we propose Boosted Mean Shift Clustering (BMSC) to address these issues. BMSC partitions the data across a grid and applies mean shift locally on the cells of the grid, each providing a number of intermediate modes (iModes). A mode-boosting technique is proposed to select points in denser regions iteratively, and DBSCAN is utilized to partition the obtained iModes iteratively. Our proposed BMSC can overcome the limitations of mean shift and DBSCAN, while preserving their desirable properties. Complexity analysis shows its potential to deal with large-scale data, and extensive experimental results on both synthetic and real benchmark data demonstrate its effectiveness and robustness to parameter settings.
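
The bandwidth sensitivity mentioned in the abstract can be illustrated with a minimal sketch of plain mean shift. This is not the authors' BMSC: there is no grid, no mode boosting, and no DBSCAN step. The Gaussian kernel, the toy two-blob data, and the names `mean_shift_modes`, `label_by_mode`, and `count_clusters` are all assumptions made for illustration.

```python
import numpy as np


def mean_shift_modes(points, bandwidth, n_iter=200, tol=1e-6):
    """Move a copy of every point uphill on a Gaussian kernel density estimate."""
    points = np.asarray(points, dtype=float)
    shifted = points.copy()
    for _ in range(n_iter):
        # Squared distances from each shifted point to every original point.
        diff = shifted[:, None, :] - points[None, :, :]
        sq_dist = np.sum(diff ** 2, axis=2)
        weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
        # The kernel-weighted mean of the data is the new position.
        new_shifted = weights @ points / weights.sum(axis=1, keepdims=True)
        step = np.max(np.linalg.norm(new_shifted - shifted, axis=1))
        shifted = new_shifted
        if step < tol:
            break
    return shifted


def label_by_mode(converged, merge_radius):
    """Assign each converged point to the first mode found within merge_radius."""
    modes = []
    labels = np.empty(len(converged), dtype=int)
    for i, p in enumerate(converged):
        for k, m in enumerate(modes):
            if np.linalg.norm(p - m) < merge_radius:
                labels[i] = k
                break
        else:
            # No existing mode is close enough, so this point starts a new one.
            modes.append(p)
            labels[i] = len(modes) - 1
    return np.array(modes), labels


def count_clusters(points, bandwidth):
    """Number of distinct modes found at the given bandwidth."""
    converged = mean_shift_modes(points, bandwidth)
    # Assumption: merging within half a bandwidth is a reasonable heuristic.
    modes, _ = label_by_mode(converged, merge_radius=bandwidth / 2.0)
    return len(modes)


# Toy data (an assumption, not from the paper): two well-separated 2-D blobs.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(60, 2))
blob_b = rng.normal(loc=[4.0, 0.0], scale=0.3, size=(60, 2))
data = np.vstack([blob_a, blob_b])

print(count_clusters(data, bandwidth=1.0))
print(count_clusters(data, bandwidth=10.0))
```

On this toy data, the moderate bandwidth should recover the two blobs as separate modes, while the large bandwidth smooths the density into a single mode and merges them. A bandwidth much smaller than the blob spread would tend to fragment each blob into several modes instead, which is the failure case the abstract describes.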

Related resources

On Mean Shift Clustering for Directional Data on a Hypersphere

The mean shift clustering algorithm is a useful tool for clustering numeric data. Recently, Chang-Chien et al. [1] proposed a mean shift clustering algorithm for circular data that are directional data on a plane. In this paper, we extend the mean shift clustering for directional data on a hypersphere. The three types of mean shift procedures are considered. With the proposed mean shift cluster...

Efficient Mean-shift Clustering Using Gaussian KD-Tree

Mean shift is a popular approach for data clustering, however, the high computational complexity of the mean shift procedure limits its practical applications in high dimensional and large data set clustering. In this paper, we propose an efficient method that allows mean shift clustering performed on large data set containing tens of millions of points at interactive rate. The key in our metho...

A review of mean-shift algorithms for clustering

A natural way to characterize the cluster structure of a dataset is by finding regions containing a high density of data. This can be done in a nonparametric way with a kernel density estimate, whose modes and hence clusters can be found using mean-shift algorithms. We describe the theory and practice behind clustering based on kernel density estimates and mean-shift algorithms. We discuss the ...

On mean shift-based clustering for circular data

Cluster analysis is a useful tool for data analysis. Clustering methods are used to partition a data set into clusters such that the data points in the same cluster are the most similar to each other and the data points in the different clusters are the most dissimilar. The mean shift was originally used as a kernel-type weighted mean procedure that had been proposed as a clustering algorithm. ...

Clustering via Mode Seeking by Direct Estimation of the Gradient of a Log-Density

Mean shift clustering finds the modes of the data probability density by identifying the zero points of the density gradient. Since it does not require to fix the number of clusters in advance, the mean shift has been a popular clustering algorithm in various application fields. A typical implementation of the mean shift is to first estimate the density by kernel density estimation and then com...

Publication date: 2014